This year, a report from Centers for Disease Control and Prevention revealed that America’s obesity rate has reached a record high. Contrasting to popular belief, New Yorkers are not so safe from the obesity epidemic, as more than half of adult New Yorkers are either overweight or obese. Studies show that the rise in the obesity epidemic is partly due to disparities in food environment; it is harder for some to eat healthier because their options are limited.
For example, I and my fellow classamate often go out to McDonald that are near our home for convenience, even though these restaurants were not healthy. Environment can affect our health behaviours imperceptibly.
This project intends to look deeper into the relationship between food environment in NYC and obesity rate along with diabetes rate and stroke hospitalization rate.
1. Can the percentage of nation-wide chain and fast-food restaurants in different neighborhood of NYC be related to their obesity, diabetes and stroke hospitalization rate ?
2.Can the percentage of fast-food restaurant (defined by cuisine description in the restaurant inspection dataset) in different neighborhood of NYC be related to their obesity, diabetes and stroke hospitalization rate ?
3.Can the percentage of health inspection grade A restaurant in different neighborhood of NYC be related to their obesity, diabetes and stroke hospitalization rate?
For the first three questions, we stick to it. Since the percentage of health inspection grade A restaurant is not a significant predictor, we later put it into the model as a confounder. Moreover, we deleted the fourth problem. Composite restaurant health score were to subjective and difficult to assess. Moreover, we calculated the boro level model first due to the difficulties we encountered in scaping zipcode-neighborhood data. However, after some struggle, we still managed to get the neighborhood information and analyze the data accordingly.
1. NYC restaurant inspection:
https://data.cityofnewyork.us/Health/DOHMH-New-York-City-Restaurant-Inspection-Results/43nn-pn8j
This dataset contains the data for restaurant inspection in NYC from August 1, 2014 to December 5th, 2017. Every row is a restaurant inspection record that includes the name of the restaurant, zipcode, cuisine type description, inspection grade (A as the best grade) and so on. Since this dataset has the inspection data for 3 years, it is almost certained that it has the data for all the restuarants in NYC in 2014.
Variable used in this datasets
DBA: Name of the restaurant
BORO: name of the boro
ZIPCODE: the zipcode of the restaurant
CUISINE DESCRIPTION: the kind of food that the restaurant is providing
GRADE: Inspection grade for the specific inspection.
2. 2015 Community Health Profiles Open Data:
https://www1.nyc.gov/site/doh/data/data-publications/profiles.page
This dataset contains NYC every neighborhoods’ demographic (percentage of white race, poverty percentage), health (age-adjusted percent of adult exercised in the last 30 days, age-adjusted percent of adults as a smoker) and our main outcome (Age-adjusted percent of adults that is obese (BMI of 30 or gthe reater), ge-adjusted percent of adults, Age-adjusted rate of hospitalizations due to stroke (cerebrovascular disease) per 100,000 adults)
Variable used in this datasets
Name: the name for the neigborhood.
Racewhite_Rate: the percentage for white race
Poverty: Percent of individuals living below the federal poverty threshold
Smoking: age-adjusted percent of adults as a smoker Exercise: Age-adjusted percent of adults that reported getting any exercise in the last 30 days
Obesity: Age-adjusted percent of adults that is obese (BMI of 30 or greater) based on self-reported height and weight
Diabetes: Age-adjusted percent of adults that had ever been told by a healthcare professional that they have diabetes
Stroke_Hosp: Age-adjusted rate of hospitalizations due to stroke (cerebrovascular disease) per 100,000 adults
3. Web scraping for what zipcodes each neighborhood contains
Because Community Health Profiles Open Data is has only neighborhood level information, so we have to aggregate neighborhood level information from the restaurant inspeciton data. However, the restaurant inspection data was using zipcodes for every restaurant. Therefore, we had to add the neighborhood name to every restaurant, so that we can group by neighborhood then calculate the percentage for every neighborhood.
First, we scraped the table data from https://www.health.ny.gov/statistics/cancer/registry/appendix/neighborhoods.htm. However the neighborhood name and combinations were different from the health profile data we downloaded. We cannot merge the two datasets. Then we tried to look for the raw classification for the neigborhoods in health data from New York University’s Furman Center for Real Estate and Urban Policy and the NYC Department of City Planning. However, still no luck becuase the neighborhood was not divided by zipcode areas. Then we tried the hardest way: search the name of the neighborhood on Google and finding the matching zipcodes. It worked out well, but it was not precise, because neighborhoods were not divided according to the area of the zipcode. Different neighborhoods have the same zipcodes.
In order to be unbiased in this process, if two neighborhoods contains the same zipcode area, we will randomly assign this zipcode to only one neighborhood.
Downloading restaurant inspection data from NYC open data:
get_all_inspections = function(url) {
all_inspections = vector("list", length = 0)
loop_index = 1
chunk_size = 50000
DO_NEXT = TRUE
while (DO_NEXT) {
message("Getting data, page ", loop_index)
all_inspections[[loop_index]] =
GET(url,
query = list(`$order` = "zipcode",
`$limit` = chunk_size,
`$offset` = as.integer((loop_index - 1) * chunk_size)
)
) %>%
content("text") %>%
fromJSON() %>%
as_tibble()
DO_NEXT = dim(all_inspections[[loop_index]])[1] == chunk_size
loop_index = loop_index + 1
}
all_inspections
}
url = "https://data.cityofnewyork.us/resource/9w7m-hzhe.json"
rest_inspection = get_all_inspections(url) %>%
bind_rows()
Downloading health and demographic data CSV into local folder and reading, cleaning it:
download.file("https://www1.nyc.gov/assets/doh/downloads/excel/episrv/2015_CHP_PUD.xlsx", mode="wb", destfile = "health.xlsx")
health <- read_excel("health.xlsx", sheet = "CHP_all_data") %>%
select(Name, Racewhite_Rate, Poverty, Unemployment,
Smoking, Exercise,
Obesity, Diabetes, Stroke_Hosp) %>%
clean_names()
Matching the neighborhood names with restaurant zipcodes:
zip_neighbor <- read_csv("neigh_zipcode.csv") %>%
mutate(zipcode = as.character(zipcode))
##restaurant data with neighbourhood
rest_neighborhood = left_join(rest_inspection, zip_neighbor, by = "zipcode") %>%
filter(!is.na(neighborhood))
health_only_neighborhood <- health[-c(1:6),] %>%
rename(neighborhood = name) %>%
mutate(neighborhood = as.factor(neighborhood))
##Plotting for outcome in different neighborhood
health_only_neighborhood %>%
mutate(neighborhood = fct_reorder(neighborhood, obesity)) %>%
filter(obesity > 25) %>%
ggplot(aes(x = neighborhood,y = obesity, fill = neighborhood)) + geom_col() +
theme(axis.text.x = element_text(angle = 45, hjust = 1),
legend.position = "none") +
labs(title = "Neighborhood with 25 percent more obesity rate")
health_only_neighborhood %>%
mutate(neighborhood = fct_reorder(neighborhood, diabetes)) %>%
filter(diabetes > 10) %>%
ggplot(aes(x = neighborhood,y = diabetes, fill = neighborhood)) + geom_col() +
theme(axis.text.x = element_text(angle = 45, hjust = 1),
legend.position = "none") +
labs(title = "Neighborhood with 10 percent more diabetes rate")
health_only_neighborhood %>%
mutate(neighborhood = fct_reorder(neighborhood, stroke_hosp)) %>%
filter(stroke_hosp > 300) %>%
ggplot(aes(x = neighborhood,y = stroke_hosp, fill = neighborhood)) +
geom_col() +
theme(axis.text.x = element_text(angle = 45, hjust = 1),
legend.position = "none") +
labs(title = "Neighborhood with 300 more stroke hospitalization in 100,000 adults")
Summary: Williamsbridge and Baychester have the highest obesity rate, up to 32%. East New York and Starrett City have the highest diabetes rate, up to 17 percent. BUshwick has the highest stroke hospitalization in 100,000 adults, up to 300 adults. Morrisania and Crotona has the highest percent of poverty rate, up to 40%. Fiancial district has the highest percent of young peoeple, up to 67%.
# Identify fastfood restaurants by their cuisine types. We print out the cuisine type list (n=85) and let everyone circle the ones they think is fastfood and we use the union as our rule (use union because it's more conservative).
rest_inspection %>%
mutate(dba = str_to_upper(dba)) %>%
arrange(cuisine_description) %>%
count(cuisine_description)
# calculating the total number of restaurants and the number of fastfood restaurants in the neighborhood, as well as the percentage of fastfood restaurants.
neighborhood_list =
rest_neighborhood %>%
distinct(neighborhood) %>%
arrange(neighborhood)
rest_fastfood_neighborhood =
rest_neighborhood %>%
filter(cuisine_description %in% c("Bagels/Pretzels",
"Bottled beverages, including water, sodas, juices, etc.",
"Chicken",
"Delicatessen",
"Donuts",
"Hamburgers",
"Hotdogs",
"Hotdogs/Pretzels",
"Ice Cream, Gelato, Yogurt, Ices",
"Nuts/Confectionary",
"Pancakes/Waffles",
"Pizza",
"Soul Food",
"Sandwiches",
"Sandwiches/Salads/Mixed Buffet",
"Soups & Sandwiches"))
percent_fastfood_neighborhood = function(name_neighborhood){
rest_each_neighborhood =
rest_neighborhood %>%
filter(neighborhood == name_neighborhood) %>%
distinct(camis)
n_rest_neighborhood = nrow(rest_each_neighborhood)
rest_fastfood_distinct_neighborhood =
rest_fastfood_neighborhood %>%
filter(neighborhood == name_neighborhood) %>%
distinct(camis, cuisine_description)
n_fastfood_neighborhood = nrow(rest_fastfood_distinct_neighborhood)
percent_fastfood_neighborhood = n_fastfood_neighborhood/n_rest_neighborhood
tibble(
neighborhood = name_neighborhood,
n_fastfood = n_fastfood_neighborhood,
n_rest = n_rest_neighborhood,
percent_fastfood = percent_fastfood_neighborhood
)
}
fastfood_neighborhood =
map(neighborhood_list$neighborhood, percent_fastfood_neighborhood) %>%
bind_rows() %>%
mutate(neighborhood = str_to_upper(neighborhood))
fastfood_neighborhood %>%
mutate(neighborhood = as.factor(neighborhood),
n_rest = as.numeric(n_rest),
n_nonfastfood = n_rest - n_fastfood,
neighborhood = fct_reorder(neighborhood, percent_fastfood)) %>%
plot_ly(., x = ~neighborhood, y = ~n_fastfood, type = 'bar', name = 'fastfood restaurants') %>%
add_trace(y = ~n_nonfastfood, name = 'non-fastfood restaurants') %>%
layout(yaxis = list(title = 'number of restaurants'),
xaxis = list(title = 'neighborhood (ordered by percentage of fastfood restaurants)',
tickangle = 45),
barmode = 'stack')
From the plot, we can see that, while the Greenwich Village and SOHO neighborhood has fairly large number of restaurants, it has the smallest percentage of fastfood restaurants. Williamsbridge and Baychester has the largest percentage of fastfood restaurants.
When large number of total restaurants is not equal to large percentage of fastfood restaurants in that neighborhood, we can conclude that the distribution of fastfood restaurants is not even across neighborhoods, which also implies the motivation of our study, we want to investigate if this uneven distribution of fastfood restaurants is associated with diffferent level of chronic disease outcomes within a neighborhood.
## Matching list of chain restaurants in U.S. and restaurant inspection data
chains_html = read_html("https://en.wikipedia.org/wiki/List_of_restaurant_chains_in_the_United_States#Fast-casual")
chain_rest = chains_html %>%
html_nodes("ul:nth-child(9) li , .jquery-tablesorter tr:nth-child(1) td:nth-child(1)") %>%
html_text() %>%
as.tibble() %>%
mutate(dba = value,
dba = str_to_upper(dba)) %>%
select(dba)
# read in the list of chain restaurants in us
# made the names to uppercase and changed the var name to dba
head(chain_rest, 10)
## # A tibble: 10 x 1
## dba
## <chr>
## 1 A&W RESTAURANTS
## 2 APPLEBEE'S
## 3 BAJA FRESH
## 4 BOSTON MARKET
## 5 BUFFALO WILD WINGS
## 6 CHILI'S
## 7 CHIPOTLE MEXICAN GRILL
## 8 CICI'S PIZZA
## 9 COLD STONE CREAMERY
## 10 CORNER BAKERY CAFE
I have scraped list of 75 national chain restaurants from the wikipedia page (https://en.wikipedia.org/wiki/List_of_restaurant_chains_in_the_United_States#Fast-casual) now I will join this dataset with restaurant inspection data to choose only the chain restaurants in NYC.
# removing punctuations in chain_rest & restaurant inspections (neighborhoods)
chain_rest_str =
chain_rest %>%
mutate(dba = str_replace_all(dba, "[[:punct:]]", ""))
rest_neigh_str = rest_neighborhood %>%
mutate(dba = str_replace_all(dba, "[[:punct:]]", ""))
# Matching the two datasets(restaurant inspection data that has all punctuation removed from dba(restaurant name) and list of chain restaurants by dba)
neighborhood_chain =
right_join(rest_neigh_str, chain_rest_str) %>%
filter(!is.na(camis)) %>%
distinct(camis, dba, neighborhood, boro)
## Joining, by = "dba"
neighborhood_chain %>%
group_by(dba) %>%
summarise(n = n()) %>%
arrange(desc(n)) %>%
head(10)
## # A tibble: 10 x 2
## dba n
## <chr> <int>
## 1 DUNKIN DONUTS 453
## 2 SUBWAY 364
## 3 STARBUCKS 307
## 4 MCDONALDS 216
## 5 CHIPOTLE MEXICAN GRILL 78
## 6 WENDYS 46
## 7 APPLEBEES 28
## 8 BOSTON MARKET 24
## 9 WHITE CASTLE 23
## 10 PIZZA HUT 21
The combined dataframe, nyc_chain has nrow(nyc_chain) observations. Also, there were 33 different chain restaurants extracted. The chart is showing the 10 chain restaurants in descending order.
# counting chains in neighborhoods
neigh_count_chain = neighborhood_chain %>%
group_by(neighborhood, boro) %>%
summarise(chain_n = n())
neigh_count_rest = rest_neighborhood %>%
distinct(neighborhood, camis) %>%
group_by(neighborhood) %>%
summarise(res_n = n())
# calculating percentage of chains in each boro
percent_neighborhood_chain = left_join(neigh_count_chain, neigh_count_rest) %>%
ungroup() %>%
mutate(chain_percentage = chain_n/res_n,
neighborhood = str_to_upper(neighborhood))
## Joining, by = "neighborhood"
plot_chain_neighbor = percent_neighborhood_chain %>%
mutate(neighborhood = forcats::fct_reorder(neighborhood, chain_percentage)) %>%
ggplot(aes(neighborhood, chain_percentage, fill = boro)) + geom_bar(stat="identity")
ggplotly(plot_chain_neighbor)
## We recommend that you use the dev version of ggplot2 with `ggplotly()`
## Install it with: `devtools::install_github('hadley/ggplot2')`
I have plotted neighborhoods of NYC against their percentage of chain restaurants and grouped them by borough. We see that all the neighborhodds that had a percentage of chain restuarants less than 5% were all part of Brooklyn except Greenwhich village and Soho of Manhattan. Neighborhoods in Queens and Manhattan are spread out across low to high percentage of chain restaurants while most of neighborhoods in Bronx and Staten Island have high percentages. The neighbor hood that had the highest percentage of chain restaurants were Throgs neck and Co-op City of Bronx with 16.5% of chain restaurats out of all restaurats.
gradea_neighborhood =
rest_neighborhood %>%
group_by(neighborhood, grade) %>%
summarise(n = n()) %>%
mutate(grade_percent = n / sum(n)) %>%
filter(grade == "A") %>%
ungroup(boro) %>%
mutate(neighborhood = str_to_upper(neighborhood))
gradea_neighborhood %>%
mutate(neighborhood = as.factor(neighborhood),
neighborhood = fct_reorder(neighborhood,grade_percent)) %>%
plot_ly(x = ~neighborhood, y = ~grade_percent, color = ~neighborhood, type = "bar")
## Warning in RColorBrewer::brewer.pal(N, "Set2"): n too large, allowed maximum for palette Set2 is 8
## Returning the palette you asked for with that many colors
Different from the situation in borough, the difference of percentage of “grade-A” restaurants in each borough exists. “Throgs Neck and Co-op City” has the greatest grade-A restaurants percentage, around 44.50%. “Sunset Park”, however, has the least, around 30.62%. “Washington Heights”, where we live, takes the fourth from the end, 34.39%, which is obviously consistent with the feeling we have towards the restaurant condition of “Washington Heights”.
health_neighborhood =
health %>%
mutate(neighborhood = str_to_upper(name)) %>%
select(-name)
combined_chain =
percent_neighborhood_chain %>%
select(neighborhood, chain_percentage)
combined_chain_fastfood =
fastfood_neighborhood %>%
mutate(fastfood_percent = percent_fastfood) %>%
select(neighborhood, fastfood_percent) %>%
right_join(combined_chain, by = "neighborhood")
combined_chain_fastfood_gradea =
gradea_neighborhood %>%
select(neighborhood, grade_percent) %>%
right_join(combined_chain_fastfood, by = "neighborhood")
combined_model =
left_join(combined_chain_fastfood_gradea, health_neighborhood, by = "neighborhood")
lm(obesity ~ fastfood_percent + grade_percent + racewhite_rate + poverty + smoking + exercise, combined_model) %>%
broom::tidy() %>%
kable()
| term | estimate | std.error | statistic | p.value |
|---|---|---|---|---|
| (Intercept) | 31.9211243 | 15.3984162 | 2.0730135 | 0.0431433 |
| fastfood_percent | 59.4287281 | 16.4442064 | 3.6139615 | 0.0006802 |
| grade_percent | -28.2875894 | 27.9231907 | -1.0130500 | 0.3157269 |
| racewhite_rate | -0.0925686 | 0.0428847 | -2.1585476 | 0.0355225 |
| poverty | 0.0514464 | 0.0956373 | 0.5379318 | 0.5929192 |
| smoking | 0.7632678 | 0.2462693 | 3.0993214 | 0.0031276 |
| exercise | -0.2324972 | 0.1439111 | -1.6155610 | 0.1122412 |
lm(obesity ~ chain_percentage + grade_percent + racewhite_rate + poverty + smoking + exercise, combined_model) %>%
broom::tidy() %>%
kable()
| term | estimate | std.error | statistic | p.value |
|---|---|---|---|---|
| (Intercept) | 26.9238921 | 16.1267974 | 1.6695126 | 0.1010260 |
| chain_percentage | 68.2439998 | 26.3482159 | 2.5900805 | 0.0124195 |
| grade_percent | -6.1900297 | 27.7492092 | -0.2230705 | 0.8243546 |
| racewhite_rate | -0.1308782 | 0.0418674 | -3.1260190 | 0.0028985 |
| poverty | 0.1357973 | 0.1065868 | 1.2740539 | 0.2083092 |
| smoking | 0.8302436 | 0.2575427 | 3.2237124 | 0.0021876 |
| exercise | -0.2204302 | 0.1532812 | -1.4380771 | 0.1564036 |
lm(obesity ~ fastfood_percent + chain_percentage + grade_percent + racewhite_rate + poverty + smoking + exercise, combined_model) %>%
broom::tidy() %>%
kable()
| term | estimate | std.error | statistic | p.value |
|---|---|---|---|---|
| (Intercept) | 31.4461097 | 15.5523111 | 2.0219573 | 0.0484418 |
| fastfood_percent | 52.8234591 | 22.0338610 | 2.3973764 | 0.0202136 |
| chain_percentage | 15.2539278 | 33.5366773 | 0.4548431 | 0.6511524 |
| grade_percent | -29.7598701 | 28.3241688 | -1.0506882 | 0.2983551 |
| racewhite_rate | -0.0920098 | 0.0432330 | -2.1282332 | 0.0381711 |
| poverty | 0.0710215 | 0.1055479 | 0.6728842 | 0.5040586 |
| smoking | 0.7603044 | 0.2482547 | 3.0625984 | 0.0034987 |
| exercise | -0.2223368 | 0.1467317 | -1.5152612 | 0.1358801 |
We originally have two main predictors of interest, percentage of chain restaurants and percentage of fastfood restaurants, and they are both significantly associated with the three chronic disease health oucomes when solely in the model after adjusting for other potential confounders. Taking obesity as example, the p-value for fastfood_percent is 0.0006801903 and for chain_percent is 0.012419514.
We will need to make decision on which to keep as the final main predictor. So we put the two predictors in the same model and
lm(diabetes ~ fastfood_percent + grade_percent + racewhite_rate + poverty + smoking + exercise, combined_model) %>%
broom::tidy() %>%
kable()
| term | estimate | std.error | statistic | p.value |
|---|---|---|---|---|
| (Intercept) | 17.1764761 | 6.4782224 | 2.6514181 | 0.0105992 |
| fastfood_percent | 16.9191098 | 6.9181937 | 2.4455964 | 0.0178828 |
| grade_percent | 1.6974935 | 11.7474835 | 0.1444985 | 0.8856655 |
| racewhite_rate | -0.0787451 | 0.0180419 | -4.3645736 | 0.0000607 |
| poverty | 0.0483703 | 0.0402353 | 1.2021846 | 0.2347381 |
| smoking | 0.1523610 | 0.1036073 | 1.4705630 | 0.1474349 |
| exercise | -0.1449503 | 0.0605444 | -2.3941147 | 0.0203017 |
lm(diabetes ~ fastfood_percent + racewhite_rate + poverty + smoking + exercise, combined_model) %>%
broom::tidy() %>%
kable()
| term | estimate | std.error | statistic | p.value |
|---|---|---|---|---|
| (Intercept) | 17.6786699 | 5.4163283 | 3.263958 | 0.0019270 |
| fastfood_percent | 17.4843100 | 5.6533466 | 3.092736 | 0.0031611 |
| racewhite_rate | -0.0778464 | 0.0167787 | -4.639603 | 0.0000233 |
| poverty | 0.0469398 | 0.0386366 | 1.214906 | 0.2297869 |
| smoking | 0.1529457 | 0.1025675 | 1.491172 | 0.1418447 |
| exercise | -0.1443873 | 0.0598582 | -2.412154 | 0.0193531 |
lm(obesity ~ fastfood_percent + grade_percent + racewhite_rate + poverty + smoking + exercise, combined_model) %>%
broom::tidy() %>%
kable()
| term | estimate | std.error | statistic | p.value |
|---|---|---|---|---|
| (Intercept) | 31.9211243 | 15.3984162 | 2.0730135 | 0.0431433 |
| fastfood_percent | 59.4287281 | 16.4442064 | 3.6139615 | 0.0006802 |
| grade_percent | -28.2875894 | 27.9231907 | -1.0130500 | 0.3157269 |
| racewhite_rate | -0.0925686 | 0.0428847 | -2.1585476 | 0.0355225 |
| poverty | 0.0514464 | 0.0956373 | 0.5379318 | 0.5929192 |
| smoking | 0.7632678 | 0.2462693 | 3.0993214 | 0.0031276 |
| exercise | -0.2324972 | 0.1439111 | -1.6155610 | 0.1122412 |
lm(diabetes ~ fastfood_percent + racewhite_rate + poverty + smoking + exercise, combined_model) %>%
broom::tidy() %>%
kable()
| term | estimate | std.error | statistic | p.value |
|---|---|---|---|---|
| (Intercept) | 17.6786699 | 5.4163283 | 3.263958 | 0.0019270 |
| fastfood_percent | 17.4843100 | 5.6533466 | 3.092736 | 0.0031611 |
| racewhite_rate | -0.0778464 | 0.0167787 | -4.639603 | 0.0000233 |
| poverty | 0.0469398 | 0.0386366 | 1.214906 | 0.2297869 |
| smoking | 0.1529457 | 0.1025675 | 1.491172 | 0.1418447 |
| exercise | -0.1443873 | 0.0598582 | -2.412154 | 0.0193531 |
lm(stroke_hosp ~ fastfood_percent + grade_percent + racewhite_rate + poverty + smoking + exercise, combined_model) %>%
broom::tidy() %>%
kable()
| term | estimate | std.error | statistic | p.value |
|---|---|---|---|---|
| (Intercept) | 128.733029 | 159.5106855 | 0.8070496 | 0.4233141 |
| fastfood_percent | 573.540649 | 170.3439236 | 3.3669569 | 0.0014362 |
| grade_percent | -414.861424 | 289.2535987 | -1.4342481 | 0.1574881 |
| racewhite_rate | -1.373925 | 0.4442382 | -3.0927657 | 0.0031864 |
| poverty | 1.564361 | 0.9906978 | 1.5790491 | 0.1203893 |
| smoking | 6.603847 | 2.5510800 | 2.5886475 | 0.0124653 |
| exercise | 1.909576 | 1.4907614 | 1.2809397 | 0.2058984 |
We have three final models each having a health outcome (prevalence of obesity, prevalence of diabetes, and stroke hospitalization rate) as the dependent variable and percentage of fastfood as the main predictor. Obesity and percentage of fastfood restuarants model gave the most significant results with p-value of the main predictor being 0.0005364 (<0.01). In other words, at significance level of 1%, every 0.1 percent increase in fastfood restaurants in a neighborhood is associated with 5% increase in the neighborhood’s obesity prevalence, while adjusting for other factors.
The p-value of the regression coefficients for model with outcomes as diabetes/stroke were both arround 0.003 (<0.01), indicating there is a significant linear relationship between diabetese/stroke and percentage of fastfood restaurants at 1% significance level.
It is reasonable that among the three health outcomes, obesity has the strongest linear association with food environment, in this case, percentage of fastfood restaurants. It is a health condition that has the most direct relation with one’s diet. Moreover, it has a higher prevalence than diabetes and stroke, which could lead to having a lower p-value than the other two health conditions
write
write